library(here)
## here() starts at /Users/josephdekel/git_projects/fish-reflectance
here()
## [1] "/Users/josephdekel/git_projects/fish-reflectance"
here("Data", "varied_spectra.csv", "pablo_equation_1_transposed.csv" )
## [1] "/Users/josephdekel/git_projects/fish-reflectance/Data/varied_spectra.csv/pablo_equation_1_transposed.csv"
df <- read.csv("Data/varied_spectra.csv")
I first created synthetic data that would effectively demonstrate how different patches on fish can be distinguished using hyperspectral measurements. The varied_spectra.csv file was designed to contain five measurements per patch type, simulating natural variation while maintaining distinct spectral signatures and characteristic spectral patterns that mirror biological reality.
# Calculating spectral angle between two spectra (x and y)
calc_spectral_angle <- function(x, y)
{
sum_xy <- sum(x * y)
sum_x2 <- sum(x^2)
sum_y2 <- sum(y^2)
alfa <- acos(sum_xy / sqrt(sum_x2 * sum_y2))
return(alfa) # Returning angle in radians
}
The code implementation centres around the spectral angle calculation function, which implements the formula from the paper: α = cos⁻¹(ΣXY/√(Σ(X)²Σ(Y)²)). This calculation is crucial for identifying the most representative measurement for each patch type by finding the measurement closest to the mean. The function returns angles in radians, providing a standardized measure of similarity between spectra.
# Creating first plot (initialized with first measurement)
plot(df[,1], df[,2], # Plotting wavelength (col 1) vs first measurement (col 2)
type="l", # Line plot
ylim=c(0,100), # Y-axis from 0 to 100%
xlab="Wavelength (nm)",
ylab="Reflectance (%)",
main="Patch Measurements and Means")
# Defining colors and patch names for plotting
colors <- c("black", "blue", "orange", "gray") # Colour for each patch
patches <- c("black", "blue", "orange", "white") # Names of patches
# Ploting all individual measurements for each patch
for(patch in patches)
{
patch_cols <- grep(patch, names(df)) # Finding all columns for this patch
for(col in patch_cols) # Loop through each column of this patch
{
lines(df[,1], df[,col], # Adding line for each measurement
col=colors[which(patches == patch)]) # Using corresponding colour
}
}
# Calculating & plotting mean spectra
means <- data.frame(wavelength = df[,1]) # Creating dataframe for means
for(patch in patches)
{
patch_cols <- grep(patch, names(df)) # Finding columns for this patch
means[[patch]] <- rowMeans(df[,patch_cols]) # Calculating mean spectrum
lines(df[,1], means[[patch]], # Plotting mean as thicker line
col=colors[which(patches == patch)],
lwd=3)
}
# Legend for first plot
legend("topleft", patches, col=colors, lwd=2)
The plot displays all measurements and means, with individual measurements shown as thin lines and means as thicker lines in corresponding colours. This visualization demonstrates the variation within patches while maintaining clear distinction between patch types.
# Finding representative spectrum (smallest α) for each patch
rep_spectra <- data.frame(wavelength = df[,1]) # Creating dataframe for representative spectra (starting with wavelengths)
for(patch in patches)
{
patch_cols <- grep(patch, names(df)) # Finding all columns for this patch
mean_spec <- means[[patch]] # Getting mean spectrum we calculated earlier
angles <- numeric(length(patch_cols)) # Creating empty vector to store angles
# Calculating spectral angle between each measurement and mean
for(i in seq_along(patch_cols))
{
angles[i] <- calc_spectral_angle(df[,patch_cols[i]], mean_spec)
}
# Selecting measurement with smallest angle as representative
rep_spectra[[patch]] <- df[,patch_cols[which.min(angles)]]
# Printing the smallest angle found
cat(sprintf("%s patch - smallest angle: %f\n", patch, min(angles)))
}
## black patch - smallest angle: 0.005026
## blue patch - smallest angle: 0.001958
## orange patch - smallest angle: 0.000920
## white patch - smallest angle: 0.000547
# Creating second plot - representative spectra
plot(rep_spectra$wavelength, rep_spectra$black, # Initialize with black patch
type="l",
ylim=c(0,100),
xlab="Wavelength (nm)",
ylab="Reflectance (%)",
main="Representative Spectra (Smallest α)",
col="black")
# Adding lines for other patches
lines(rep_spectra$wavelength, rep_spectra$blue, col="blue")
lines(rep_spectra$wavelength, rep_spectra$orange, col="orange")
lines(rep_spectra$wavelength, rep_spectra$white, col="gray")
# Legend for second plot
legend("topleft", patches, col=colors, lwd=1)
This plot shows only the representative spectra (those closest to the mean), providing a cleaner visualization for quick comparison between patches.
The data processing flow moves systematically from reading the CSV file through plotting all measurements, calculating means, finding representative spectra using spectral angles, and finally creating the comparative plots. The implementation uses nested loops for efficiency, with the outer loop processing each patch type and the inner loop handling individual measurements. The grep function identifies relevant columns for each patch type. In the code, the wavelength column positioned first for easy reference.
Colour choices in the visualization match the actual patch colours for intuitive interpretation, with line thickness differentiating means from individual measurements. The y-axis range of 0-100% covers the full possible reflectance range, while the wavelength range matches the visible spectrum. This approach provides a practical demonstration of the spectral analysis methods described in the Piranha paper, allowing both examination of variation within patches and comparison between different patch types.
Trying to explore the methods of using Alfa Angle as a tool for comparing reflectance spectra (as seen in the piranha paper https://www.nature.com/articles/s41598-021-95713-0).
This code analyzes Trigger Fish patch spectral data to determine:
Whether we can tell different patch types apart (black, white, orange, blue)
How much variation exists within patches vs. across individuals
# =============================================================================
# Patches Variation Analysis
# This code analyzes Trigger Fish patch spectral data to determine:
# 1. Whether we can tell different patch types apart (black, white, orange, blue)
# 2. How much variation exists within patches vs. across individuals
# =============================================================================
df <- read.csv("Data/pablo_equation_1_transposed.csv")
# This function calculates the angular difference between two spectral signatures
# Based on the formula from the piranha paper: α = cos⁻¹(ΣXY/√(Σ(X)²Σ(Y)²))
# Smaller angles indicate greater similarity between spectra
calc_spectral_angle <- function(x, y)
{
sum_xy <- sum(x * y)
sum_x2 <- sum(x^2)
sum_y2 <- sum(y^2)
alfa <- acos(sum_xy / sqrt(sum_x2 * sum_y2))
return(alfa) # Return angle in radians
}
This function calculates the angular difference between two spectral signatures. Based on the formula from the piranha paper: α = cos⁻¹(ΣXY/√(Σ(X)²Σ(Y)²)). Smaller angles indicate greater similarity between spectra.
# Defining our patch types (colors) and individual identifiers (repeats)
patch_types <- c("black", "white", "orange", "blue") # The four patch types
individuals <- c("01", "02", "03", "04") # The four individuals (repeats)
# =============================================================================
# Organizing the data by patch type and individual
# =============================================================================
# Creating a nested list structure to easily access data:
# patch_columns$black$01 will contain the spectral data for black patch on individual 01
patch_columns <- list()
for(patch in patch_types)
{
# Creating a sub-list for each patch type
patch_columns[[patch]] <- list()
# For each individual, finding and storing the corresponding column data
for(ind in individuals)
{
col_name <- paste0(patch, "_", ind) # Constructing column name (e.g., "black_01")
# Only adding data if the column exists in our dataset
if(col_name %in% names(df))
{
patch_columns[[patch]][[ind]] <- df[[col_name]] # Storing the spectral data
}
}
}
patch_means <- data.frame(wavelength = df$wavelength) # Starting with wavelength column
for(patch in patch_types)
{
# Combining all measurements for this patch type into a matrix
patch_data <- do.call(cbind, patch_columns[[patch]])
# Calculating row means (mean reflectance at each wavelength)
patch_means[[patch]] <- rowMeans(patch_data, na.rm = TRUE)
}
# Defining colors for plotting that match the patch types
colors <- c("black", "gray", "orange", "blue")
# Starting with the first patch (black)
plot(df$wavelength, patch_means$black,
type="l", # Line plot
col=colors[1], # black color
lwd=2,
ylim=c(0, max(patch_means[,-1], na.rm=TRUE) * 1.1), # Y-axis range with 10% margin
xlab="Wavelength (nm)",
ylab="Reflectance (%)",
main="Mean Spectra by Patch Type")
# Adding lines for the remaining patch types
for(i in 2:length(patch_types))
{
lines(df$wavelength, # X values (wavelength)
patch_means[[patch_types[i]]], # Y values (reflectance)
col=colors[i], # Color corresponding to patch type
lwd=2)
}
# Legend to identify each line
legend("topright",
patch_types, # Labels for legend
col=colors, # Colors for legend
lwd=2)
The image shows:
The mean spectral reflectance curves for each patch type (black, white, orange, blue) across the visible spectrum (400-700nm).
Each line represents the average reflectance pattern for a specific patch type.
What it tells us:
There are clear, distinctive spectral signatures for each patch type.
White patches have the highest overall reflectance (70-90%), as expected.
Orange patches show moderate reflectance (20-45%) with increasing reflectance at longer wavelengths (characteristic of orange coloration).
Blue patches show specific reflectance patterns with a slight peak around 550-580nm.
Black patches have the lowest reflectance overall (below 15%).
Importance:
This confirms that different patch types have fundamentally different spectral signatures.
These distinctive signatures form the basis for using hyperspectral analysis to identify and differentiate patches.
# This quantifies how different each patch type is from the others
cat("Spectral angles between patch types (in radians):\n")
## Spectral angles between patch types (in radians):
# Creating a matrix to store the angles between each pair of patch types
patch_angles <- matrix(0, # Initializing with zeros
nrow=length(patch_types), # Rows = number of patch types
ncol=length(patch_types)) # Columns = number of patch types
# Adding row and column names to the matrix for clarity
rownames(patch_angles) <- patch_types
colnames(patch_angles) <- patch_types
# Calculating the spectral angle between each pair of patch types
for(i in 1:length(patch_types))
{
for(j in 1:length(patch_types))
{
if(i != j) # Skipping scenario comparing a patch with itself (angle would be 0)
{
# Calculating spectral angle between mean spectra of two patch types
patch_angles[i,j] <- calc_spectral_angle(
patch_means[[patch_types[i]]], # Mean spectrum of first patch type
patch_means[[patch_types[j]]] # Mean spectrum of second patch type
)
}
}
}
# Printing the matrix of angles (rounded to 4 decimal places)
print(round(patch_angles, 4))
## black white orange blue
## black 0.0000 0.2244 0.1224 0.1802
## white 0.2244 0.0000 0.2131 0.1979
## orange 0.1224 0.2131 0.0000 0.1065
## blue 0.1802 0.1979 0.1065 0.0000
The figure above shows:
A matrix of spectral angles between the mean spectra of different patch types.
Values range from 0 (identical) to higher values (more different).
What this tells us:
Black and orange patches are most similar to each other (angle = 0.1224).
White and orange patches are most different (angle = 0.2131).
Blue patches are most similar to orange patches (angle = 0.1065).
White patches are consistently the most different from other patch types.
Importance:
This quantifies exactly how different each patch type is from the others.
Shows which patches might be most easily confused with each other.
Provides a numerical basis for distinguishing patches in classification tasks.
# This measures how much variation exists within the same patch type across different individuals
within_variation <- list()
for(patch in patch_types)
{
within_variation[[patch]] <- c() # Initializing empty vector for this patch
# Comparing all possible pairs of individuals for this patch type
for(i in 1:(length(individuals)-1))
{
for(j in (i+1):length(individuals))
{
ind1 <- individuals[i]
ind2 <- individuals[j]
# Checking if we have data for both individuals for this patch
if(!is.null(patch_columns[[patch]][[ind1]]) && !is.null(patch_columns[[patch]][[ind2]]))
{
# Calculating spectral angle between the same patch type on different individuals
angle <- calc_spectral_angle(
patch_columns[[patch]][[ind1]], # Patch on first individual
patch_columns[[patch]][[ind2]] # Same patch on second individual
)
# Storing the angle in our list
within_variation[[patch]] <- c(within_variation[[patch]], angle)
}
}
}
}
# This measures how different patch types are within the same individual
# (For example, how different is black from blue on individual 01)
between_variation <- list()
for(ind in individuals)
{
between_variation[[ind]] <- c() # Initializing empty vector for this individual
# Comparing all possible pairs of patch types for this individual
for(i in 1:(length(patch_types)-1))
{
for(j in (i+1):length(patch_types))
{
patch1 <- patch_types[i]
patch2 <- patch_types[j]
# Checking if we have data for both patch types for this individual
if(!is.null(patch_columns[[patch1]][[ind]]) && !is.null(patch_columns[[patch2]][[ind]]))
{
# Calculating spectral angle between different patch types on the same individual
angle <- calc_spectral_angle(
patch_columns[[patch1]][[ind]], # First patch type on this individual
patch_columns[[patch2]][[ind]] # Second patch type on this individual
)
# Storing the angle in our list
between_variation[[ind]] <- c(between_variation[[ind]], angle)
}
}
}
}
# Preparing data for boxplot by flattening the lists
within_all <- unlist(within_variation) # Combining all within-patch angles
between_all <- unlist(between_variation) # Combining all between-patch angles
# Creating a data frame for plotting
variation_data <- data.frame(
Variation = c(rep("Within Patch", length(within_all)), # Labels for within-patch data
rep("Between Patches", length(between_all))),# Labels for between-patch data
Angle = c(within_all, between_all) # All angle values
)
# Creating the boxplot
boxplot(Angle ~ Variation, # Formula: Angle grouped by Variation type
data=variation_data, # Data to plot
main="Variation Within vs. Between Patches",
ylab="Spectral Angle (radians)",
col=c("lightblue", "lightgreen"))
Within-patch variation is significantly lower than between-patch variation.
# Printing summary stats for within-patch variation
cat("\nSummary of within-patch variation:\n")
##
## Summary of within-patch variation:
within_summary <- summary(within_all)
print(within_summary)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.01435 0.03265 0.05766 0.06285 0.06473 0.21041
cat("Standard deviation:", sd(within_all), "\n")
## Standard deviation: 0.04891543
# Printing summary stats for between-patch variation
cat("\nSummary of between-patch variation:\n")
##
## Summary of between-patch variation:
between_summary <- summary(between_all)
print(between_summary)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1053 0.1417 0.2046 0.1832 0.2188 0.2606
cat("Standard deviation:", sd(between_all), "\n")
## Standard deviation: 0.04737195
# =============================================================================
# Statistical test: Are within-patch and between-patch variations different?
# =============================================================================
# Performing t-test to check if the difference is statistically significant
t_result <- t.test(within_all, between_all)
cat("\nT-test comparing within-patch vs. between-patch variation:\n")
##
## T-test comparing within-patch vs. between-patch variation:
print(t_result)
##
## Welch Two Sample t-test
##
## data: within_all and between_all
## t = -8.6586, df = 45.953, p-value = 3.252e-11
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.14833038 -0.09237165
## sample estimates:
## mean of x mean of y
## 0.06285286 0.18320387
The figure above shows:
Results of a t-test comparing within-patch and between-patch angles.
Extremely significant p-value (3.252e-11) confirming the difference.
Mean values: within-patch = 0.0629, between-patch = 0.1832.
What this tells us:
Within-patch variation is significantly lower than between-patch variation.
The 95% confidence interval for the difference is between 0.092 and 0.148 radians.
The difference is not due to chance (p << 0.001).
# Creating a matrix of all pairwise spectral angle comparisons
# First, getting all column names that contain patch measurements
all_columns <- c()
for(patch in patch_types)
{
for(ind in individuals)
{
col_name <- paste0(patch, "_", ind) # Constructing column name (e.g., "black_01")
if(col_name %in% names(df))
{
all_columns <- c(all_columns, col_name) # Adding to our list if it exists
}
}
}
# Creating matrix to store all pairwise spectral angles
angle_matrix <- matrix(0, # Initializing with zeros
nrow=length(all_columns), # Rows = number of columns
ncol=length(all_columns)) # Columns = number of columns
# Adding row and column names for clarity
rownames(angle_matrix) <- all_columns
colnames(angle_matrix) <- all_columns
# Calculating spectral angle between each pair of measurements
for(i in 1:length(all_columns))
{
for(j in 1:length(all_columns))
{
if(i != j) # Skipping scenario comparing a column with itself (angle would be 0)
{
# Calculating spectral angle between two measurements
angle_matrix[i,j] <- calc_spectral_angle(
df[[all_columns[i]]], # First measurement
df[[all_columns[j]]] # Second measurement
)
}
}
}
# Creating a heatmap visualization of all pairwise spectral angles
# This shows clusters of similar measurements
heatmap(angle_matrix,
main="Spectral Angles Between All Measurements",
xlab="Measurement",
ylab="Measurement",
col=heat.colors(100)) # Color palette (100 shades)
The image above shows:
Hierarchical clustering and heatmap of spectral angles between all individual measurements.
Darker colors indicate higher similarity (lower spectral angle).
The tree diagram shows hierarchical relationships based on spectral similarity.
What this tells us:
Measurements cluster primarily by patch type, not by individual.
All four orange patches cluster together, as do all black patches.
Blue patches mostly cluster together (blue_01 is an outlier).
White patches form a distinct cluster.
Importance:
Confirms that patch type is the primary determinant of spectral signature.
Shows that the spectral angle method successfully groups similar patches.
Identifies potential anomalies (like blue_01) that may need further investigation.
The hierarchical clustering provides an unbiased grouping method.
# Setting up a 2x2 grid for plotting the 4 patch types
par(mfrow=c(2,2)) # 2 rows, 2 columns of plots
# Creating one plot for each patch type
for(patch in patch_types)
{
# Creating an empty plot with appropriate limits
plot(df$wavelength, df[[paste0(patch, "_01")]],
type="n", # No points or lines initially (will add them in a sec)
# Calculating appropriate y-axis limits
ylim=c(0, max(sapply(individuals, function(ind)
{
col <- paste0(patch, "_", ind)
if(col %in% names(df)) max(df[[col]], na.rm=TRUE) else 0
}) * 1.1)), # Adding 10% margin
xlab="Wavelength (nm)",
ylab="Reflectance (%)",
main=paste("Variation in", patch, "patch"))
# Adding lines for each individual's spectrum for this patch type
for(ind in individuals)
{
col_name <- paste0(patch, "_", ind) # Constructing column name
if(col_name %in% names(df))
{
# Plotting this individual's spectrum with a unique color
lines(df$wavelength, df[[col_name]],
col=rainbow(length(individuals))[match(ind, individuals)])
}
}
# Adding the mean spectrum as a thick black line
lines(df$wavelength, patch_means[[patch]], col="black", lwd=2)
# Legend to identify each line
legend("topright",
c(individuals, "Mean"),
col=c(rainbow(length(individuals)), "black"),
lwd=c(rep(1, length(individuals)), 2),
cex=0.7)
}
# Resetting the plotting layout to default (1x1)
par(mfrow=c(1,1))
The image above shows:
Four panels showing individual variation within each patch type.
Each panel shows spectral curves for each repeat (individual) and the mean.
What this tells us:
Black patches: Relatively consistent across individuals with slight variation around 600nm.
White patches: Individual 01 shows lower reflectance than others; 02-04 are very similar.
Orange patches: Individual 01 shows higher reflectance; Individual 03 shows lower reflectance.
Blue patches: Most variable patch type; Individual 03 shows distinctive peak around 580nm.
Importance:
Reveals detailed patterns of individual variation within each patch type.
Shows which patch types are most consistent or variable across individuals.
Helps identify potentially unusual individuals (e.g., Individual 01 for white patches).
# Calculating and displaying the average spectral angle within each patch type
cat("\nAverage spectral angle within each patch type:\n")
##
## Average spectral angle within each patch type:
for(patch in patch_types)
{
if(length(within_variation[[patch]]) > 0)
{
# If we have data, calculating and printing the mean
cat(patch, ":", mean(within_variation[[patch]]), "\n")
}
else
{
# If no data is available, printing a message
cat(patch, ": No data available\n")
}
}
## black : 0.05614816
## white : 0.0333581
## orange : 0.04366843
## blue : 0.1182368
# Calculating and displaying the average spectral angle between patches for each individual
cat("\nAverage spectral angle between patches for each individual:\n")
##
## Average spectral angle between patches for each individual:
for(ind in individuals)
{
if(length(between_variation[[ind]]) > 0)
{
# If we have data, calculating and printing the mean
cat("Individual", ind, ":", mean(between_variation[[ind]]), "\n")
}
else
{
# If no data is available, printing a message
cat("Individual", ind, ": No data available\n")
}
}
## Individual 01 : 0.1779697
## Individual 02 : 0.1813528
## Individual 03 : 0.1850985
## Individual 04 : 0.1883944
Add a picture of Pablo and of Museum Specimens
Add accession numbers (specimen number on jar) & year of preservation
Methods section of how we collected data
Boxplots illustrating patches variation
for each individual, for each patch: x axis has 11 individuals, y axis is the mean and variation in reflectance (over the 5 repeats) (so we need 4 different graphs because there are 4 patches)
1 plot with 4 boxes (because 4 patches), 4 boxplots in one graph where we take the mean values get the mean and variance for each patch using the repeats and means from above
Do a nested ANOVA on the mean reflectance. Input all of the measures for all of the individuals for all of the patches. 11(individuals) x 5(imaging sessions (repeats)) x 4 (patches) = 220 rows of data. Basically: how much variation is being explained by individuals, repeated measures and patch.
Boxplots were generated to illustrate reflectance variations for each patch.
A nested ANOVA was performed to evaluate the contribution of individuals, repeated measures, and patches to overall reflectance variation.
Imaging was conducted under controlled lighting conditions.
To minimize the effect of camera heating on reflectance measurements, we randomized the order in which the 11 fish were imaged at the start of each session.
For each session, all 11 fish were imaged once.
After completing the first round of imaging, the hyperspectral camera was turned off and allowed to cool down for 3 minutes.
After the cooldown period, all 11 fish were imaged again, following the same procedure.
This process was repeated across a total of 5 imaging sessions, resulting in 5 repeat measurements for each fish.